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Abstract 



We discuss the development of cluster algorithms from the viewpoint of prob- 
ability theory and not from the usual viewpoint of a particular model. By 
using the perspective of probability theory, we detail the nature of a cluster 
algorithm, make explicit the assumptions embodied in all clusters of which we 
are aware, and define the construction of free cluster algorithms. We also illus- 
trate these procedures by rederiving the Swendsen-Wang algorithm, present- 
ing the details of the loop algorithm for a worldline simulation of a quantum 
S = 1/2 model, and proposing a free cluster version of the Swendsen-Wang 
replica method for the random Ising model. How the principle of maximum 
entropy might be used to aid the construction of cluster algorithms is also 
discussed. 
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I. INTRODUCTION 

The development of cluster Monte Carlo algorithms by Swendsen and Wang [1] and 
other researchers was a significant advance in the way which computer simulations of the 
equilibrium properties of physical systems are implemented. These algorithms reduce the 
long auto-correlation times that occur as the simulations move toward a critical point. More 
recently, other algorithms were developed to reduce similar long times inherent to other 
simulations even though they are far from finite-temperature critical points [2,3]. Inspired 
by the work of Kandel and Domany [4], who gave a relatively general interpretation to 
cluster algorithms, we now propose a different perspective and also highlight several essential 
ingredients for developing cluster algorithms. 

One of our purposes is to detail the number of natural and reasonable assumptions 
embodied to all cluster algorithms to-date. Our hope is to establish a framework for a 
more general use and more general development of cluster algorithms. Such a framework is 
needed. The approach of Kandel and Domany fits naturally onto classical systems defined 
on lattices. As such, their specifying an effective Hamiltonian and local interaction energies 
is quite constructive, but is unnatural if applied to quantum systems. The reason for this 
difference is directly traceable to the Hamiltonian in quantum mechanics being an operator 
and not a scalar function. 

As observed by Kandel and Domany, the most effective cluster algorithms are ones in 
which the interactions between clusters vanish. We call these algorithms free cluster algo- 
rithms. Again, for quantum systems, viewing the clusters as interacting or non-interacting 
may be unnatural. We feel that it is best to focus on the configuration weights and directly 
define procedures to construct clusters that can be fiipped independently. We will show 
that such new algorithms follow if a specific system of linear equations has a non-negative 
solution This system is in general underdetermined, and because underdetermined systems 
generally have an infinite number of solutions, if any solutions exist, finding several non- 
negative ones may occur. The key issue is then not the existence of a solution but rather 
the selection of an optimal solution, in the sense of computational efficiency. Our emphasis 
is defining the general structure of cluster algorithms. At this time, we only can provide 
standard suggestions for efficient algorithm selection. 

In defining a cluster algorithm, we will argue that the standard cluster algorithm is a 
form of a dual Monte Carlo process. This form of Monte Carlo is a more general Monte Carlo 
process in which the configuration and labeling are viewed jointly. The joint probability for 
a configuration and a label can be expressed in terms of conditional probabilities for a label 
given a configuration and vice versa. 

The plan of the paper is to establish notation for basic probabilities and Monte Carlo 
concepts in Section II. In Section III, we define what we mean by a cluster algorithm. In 
Section IV, we will illustrate the formalism in three different contexts: the Swendsen- Wang 
(SW) algorithm for a ferromagnetic Ising model, a cluster algorithm for the anisotropic 
quantum S = 1/2 quantum system, and the Swendsen- Wang replica method [2] for the 
random Ising model. In Section V, we conclude by discussing points for further investigation. 
Here, we will emphasize that they are many ways to construct cluster algorithms, but at 
this time we are unaware of any a priori way to insure one algorithm is optimal compared to 
others. We emphasize that our formalism is merely a procedure, as is the Kandel-Domany 



procedure, to start the construction of cluster algorithms. This procedure will help, but 
will not define how, to specify the labeling probabilities. We will discuss in the Appendix, 
however, how the principle of maximum entropy might be used to help accomplish this task. 



II. BACKGROUND 

The common way to introduce the Monte Carlo method for simulations of the equilibrium 
properties starts with specifying a functional form H^(yl) for the Boltzmann weight of states 
A of the system and for the transition probability T[A -^ A') to carry A to a new state A'. 
To insure the states are produced with the correct weight, the transition probabilities are 
almost always chosen so that the condition of detailed balance holds. This condition is 

W{A)T{A -^ A') = W{A')T{A' -^ A) (2.1) 

We will express (2.1) as 

Pr{A'\A) Pr{A) = Pr{A\A') Pr{A') (2.2) 

where Pr(A) is the probabihty of A where Pr(A) = W{A)/Z with Z = Y^^ W{A) being the 
partition function, and Pr(yl'|yl) = T[A -^ A') is the (conditional) probability of A' given 
A. 

Although (2.2) seems the same as Bayes's theorem [5], its implication is different. To be 
more precise about the statement of detailed balance, if the Monte Carlo process produces a 
sequence of states . . . , X^-i, X^, X^+i, . . ., then we can write the detailed balance condition 
as 

Pr(X„+i = A'lXr, = A) Pr(X„ = A) = Pr(X„+i = A|X„ = A') Pr(X„ = A') (2.3) 

On the other hand, Bayes's theorem is expressed as 

Pr(X„ = A'lXr, = A) Pr(X„ = A) = Pr(X„ = A|X„ = A') Pr(X„ = A') (2.4) 

where n and m are any pair of steps along the Markov chain. Bayes's theorem follows for 
the standard relation between joint and condition probabilities 

Pr(yl, A') = Pr{A'\A) Pr{A) (2.5) 

We will use the more detailed notation whenever we feel the distinction between the two 
conditions needs emphasis. The main point is however that in both cases we are dealing 
with probabilities. Thus, a number of relations automatically hold. The detailed balance 
condition is a rather special constraint imposed on the probabilities while Bayes's theorem 
is generally applicable to any conditional probability. 

A. Standard Monte Carlo 

A Monte Carlo simulation of an equilibrium process approximates the sum over all states 
of a system by a sum over a smaller set of states chosen with the correct Boltzmann weight. 



Each state is, in the language of probabihty theory, an event A, and the Monte Carlo process 
seeks to produce these events with a probability Pr(yl). The Markov process in the Monte 
Carlo procedure is defined by the conditional probability Pr(yl'|yl) of state A' given A. For 
this process to produce the states with probability Pr(yl), several conditions must be met 

[6] 

Pr(A) > (2.6) 

EPr(A) = l (2.7) 

A 

J2 Pr{A\A') Ft{A') = Ft{A) (2.8) 

A' 

If A can take N different values, then there are N'^ elements for Pr(yl|yl'). By constituting 
only 0[N) constraints, the above equations illustrate the considerable freedom that exists in 
defining the Monte Carlo process. Typically, TV is a very large number so directly selecting 
Pr(yl|yl') from these equations is not practical. 

Normally, a Monte Carlo process is specified so that it satisfies the detailed balance 
condition (2.2), which is a stronger condition than (2.8), but detailed balance still does 
not uniquely define Pr(yl'|yl). The transition probability is usually defined, in one of two 
ways, in terms of the ratio Pr(yl')/ Pr(yl). These different ways define the Metropolis and 
symmetric algorithms [7]. For a simple version of the symmetric algorithm 

Ft{A'\A) = R/{1 + R), (2.9) 

where R = 'Pv{A')/'Pv{A). 

The heat-bath algorithm defines Pr(yl'|yl) in still another way. In the heat-bath algorithm 
[8], one imagines the given state A being placed in contact with a heat-bath and allowed 
to fiuctuate through various states in a manner consistent with the Boltzmann distribution 
Pr(yl). After a while, when the heat-bath is removed, this system is left in some new state 
A' with a probability Pr(yl'). The key feature is that the new state is chosen in a manner 
independent of the current state, i.e., Pr(yl'|yl) = Pr(yl'). We will call a heat-bath algorithm 
those algorithms that chose the new state independently of the old state. 



B. Dual Monte Carlo 

In a standard Monte Carlo algorithm, the states of the system are most naturally viewed 
in terms of the local variables in the Hamiltonian. These variables might simply be the values 
of the Ising spins at each site in a lattice, the positions of gas atoms in a box, electrons on 
lattice sites, etc. With these variables, a Monte Carlo process, as described above, is created 
by specifying the transition probability Pr(yl'|yl) where A' is obtained from A, for example, 
by a single fiip of the Ising spin at a given lattice site, the displacement of a single gas 
atom, etc. A number of years ago, it was recognized that the Monte Carlo process may be 
enhanced by introducing another set of events and performing the Markov process in a joint 
space [6,9] . We will adopt this modification and argue that cluster algorithms are a special 
case of a dual Monte Carlo process. 

To develop our viewpoint, we first remark that a standard cluster algorithm starts with 
a state Xq = A and labels it as Y^ = 5 by some prescription. The label defines clusters. 



and the clusters are then flipped to produce a new state A' that can be labeled as B'. The 
process is cycled to produce the sequence 

...^A^B^A'^B'^A"^B"^... (2.10) 

A cluster algorithm, however, can be also viewed more generally as the sequence 

. . . ^ (A, 5) ^ {A', B') -^ {A", B") -^ ... (2.11) 

From this point of view, we would want to construct a Markov process that produces 
Pr(yl, 5) = lim„_>oo Pr(X„ = A,Yn = B) as its limiting probability, i.e., the transitions 
are viewed as from [A, B) to [A', B') and not just from A to A'. Several ways exists to 
produce this sequence. One such way is to specify transition probabilities Fr^A'lA, B) and 
Fr^B'lA, B) which satisfy the following extended detailed balance condition 

Pr(X„+i = A'lXr, = A,Yr, = B) Pr(X„ = A, F^ = 5) = 

Pr(X„+i = A\Xr. = A', Yr, = B) Pr(X„ = A\ F^ = B) (2.12) 

Pr(r„+i = 5'|X„+i = A\ Yr, = B) Pr(X„+i = A\ F. = 5) = 

Pr(r„+i = 5|X„+i =A,Y^ = B') Pr(X„+i = A\ Y^ = B') (2.13) 

Because at equilibrium the joint probability is the same for any pair of Monte Carlo steps, 
we rewrite the above equations more compactly as 

Pr(X„+i = A'\Xr. = A,Yr. = B) Pr(A, B) = 

Pr(X„+i = A\X^ = A\ Y^ = B) Pr(A', B) (2.14) 

Pr(r„+i = 5'|X„+i = A\Yr, = B)Ft{A',B) = 

Pr(r„+i = 5|X„+i = A, r„ = B') Ft{A', B') (2.15) 

The transition probabilities Vi{^A\A' ,B) and Vi{^B\A,B') specify the algorithm. They 
must satisfy 

Pr(yl, B) = Y, Pr(A|yl', B) Pr(yl', 5) = ^ Pr(5|yl, B') Pr(yl, B') (2.16) 

A' B' 

In addition, we must also have 

Pr(A) = ^Pr(A,5) (2.17) 

B 

to produce the desired Boltzmann weight. We also have that 

Pr(5) = ^Pr(yl,5) (2.18) 

A 

In a cluster algorithm, we seek to exploit the freedom we have in the choice of Pr(5) to 
produce an efficient and effective Monte Carlo procedure. 



C. Heat-Bath Transition Probabilities 

While equations (2.14) and (2.15) express an elegant duality between the two sets of 
events, in a cluster algorithm these equations are used with several implicit assumptions. 
Most cluster algorithms of which we are aware implicitly assume that Fr^A'lA, B) is inde- 
pendent of A and Pr(5'|yl,5) is independent of B. We will adopt these assumptions and 
refer to the resulting algorithms of being of the heat-bath-type. 

With the heat-bath assumptions and the use of the basic theorem of joint probabilities, 

Pr(yl, B) = Fv{B\A) Fv{A) = Fv{A\B) Pr(5), (2.19) 

(2.14) and (2.15) reduce to 

Pr(X„+i = A'lYr, = B) Vt[A\B) = Pr(X„+i = A\Yr, = B) Vt[A'\B) (2.20) 

Pr(r„+i = 5'|X„+i = A') Vr{B\A') = Pr(r„+i = 5|X„+i = A') Vr{B'\A') (2.21) 

It follows that 

Pr(X„+i = A\Yr. = B) = Pr(A|5), (2.22) 

Pr(r„+i = 5|X„+i = A) = Pr(5|A), (2.23) 

which means that we should choose the transition probabilities so they agree with the lim- 
iting conditional probabilities. Therefore these equations, and hence the algorithm, depend 
only on the conditional probabilities Pr(yl|5) and Pr(5|yl). They define a dual algorithm 
of the heat-bath type that produces Pr(yl) as the limiting distribution of the Markov chain. 
If we are given Pr(yl,5), the transition probabilities Pr(yl|5) and Pr(5|yl) are easily 
found. Usually, we are given Pr(yl) and specify Pr(5|yl). By (2.19), these two quantities are 
sufficient to specify Pr(yl,5). Having just Pr(yl) and Pr(yl|5) is, in general, insufficient to 
fix Pr(yl, B), but as we discuss below, situations exist where we can proceed in this manner 
and at the same time achieve considerable advantage in constructing special classes of cluster 
algorithms. 

III. CLUSTER ALGORITHMS 

A. Local Labeling and Other Background 

In most existing cluster algorithms, the labeling of the whole system is done by labeling 
individual local units. In this subsection, we state this process in general terms. First, 
we consider systems where the given state A can be represented by a set of L local units 
described by the variable a^ 

A = {ai,a2,- •• ,«!-}• (3.1) 

Each local units may consist of several local elements, each described by a variable Si. Hence, 
we can also express the state A by the set 

^ = {■Sl,52,- • • ,5Ar}. (3.2) 



where L < N. 

For the Ising model, the local units could be bonds or plaquettes, for example. If the 
local units were the bonds, the local elements would be the lattice sites on the bond. For the 
Ising model, Si is usually a two-state (one-bit) site variable whose values are a member of 
the set {+1, —1}, and the values of a^ are one of the four possible states that the bond i may 
assume, i.e., the value of a^ is a member of the set {( — 1, —1), (+1) +1)) (~1) +1)) (+1) ~1)}- 
The Ising model example also illustrates that in general a local element belongs to multiple 
local units, as a given site usually belongs to more than one bond. 

We will restrict ourselves to systems for which we can write H^(yl) = Yliw[ai). In many 
respects, this condition is not very restrictive. The factorization is true for most classical 
Monte Carlo simulations and for some quantum Monte Carlo simulations such as those using 
the worldline method. In the worldline quantum Monte Carlo method for a one-dimensional 
system of electrons, for example, the local unit for an electron of a given electron spin is a 
plaquette which can have 16 different states of electron occupancy at its corners but only 6 
of these states are consistent with the conservation of electron number. By considering only 
allowed states, one can generally express H^(yl) in factored form. 

Cluster algorithms generally assume that one can express B in terms of a set of local 
labels bi 

B = {b,,b2,...,bL} (3.3) 

The role of these labels, assigned one to each local unit, is to define the clusters, and 
the values of bi by choice generally assume only a finite number of values. In the SW 
algorithm, the local labels of "frozen" or "deleted" are assigned to bonds. For the loop- 
flip algorithm for the worldline quantum Monte Carlo method, the local labels are pairs 
of line segments assigned the shaded plaquettes of the system. The two different segments 
sometimes belong to two different clusters (loops). Cluster algorithms also generally assume 

B. Clustering 

The essence of cluster algorithms is the changing the value of a set of many local elements, 
not the local units, in a coherent manner. Using our language of dual Monte Carlo, we will 
now deflne a cluster algorithm more explicitly. In what follows, we only consider the case 
where both the state space and the label space are discrete. The generalization to continuous 
variables is, for the most part, straightforward. 

We start by deflning a clustering C as a function which maps the set of the serial numbers 
of the local elements J\f = {1,2, . . . ,7V} (i.e., sites, bonds, plaquettes, etc.) to a set J\fc of 
the serial numbers of the clusters {1, 2, • • • , Nc}. The integer Nc is the number of clusters 
and is less than or equal to N. The value of C[i) is the serial number of the cluster to which 
the local element i belongs. The labeling process provides the basis for constructing this 
function. 

Many ways to deflne clusters exist. For a collection of sites, one can create a graph by 
drawing a number of lines connecting pairs of sites. Those sites connected to each other, 
but disconnected from all other sites, form a cluster. A speciflc graph may contain several 



clusters. For a given number of sites, many different graphs may exist. Such a graphical 
procedure is at the heart of the Mayer cluster expansion often used in analytic studies of 
real gases [10]. 

In the SW algorithm, something more complicated is done. The connection (the label 
"a frozen bond") between two sites depends on the combined state of the Ising variables 
at those sites. Only sites mutually aligned are connected and then only with a probability 
chosen to insure when the cluster is flipped that the detailed balance condition is satisfled. 
In the SW algorithm, sites in a cluster all have the same value of the Ising spin, and the 
flipping process is easily deflned as changing the value of spin for all sites in the cluster. For 
the Ising model, the original Ising spins, each of which has the value of 1 or —1, map to 
a set of cluster spins, each of which also has the value of 1 or —1. The beauty of the SW 
algorithm is that after this mapping the cluster spins do not interact. In summing over all 
possible conflgurations of clusters spins, one can place each cluster into any one of its two 
possible states. 

In other algorithms, even more complicated things are done. In Kandel-Domany algo- 
rithm for the fully frustrated Ising model [3], the Evertz-Luna-Marcu algorithm for the six 
vertex model [11], and the recent algorithms for the worldline quantum Monte Carlo method 
[12,13], the natural local unit is a plaquette which consists of some local elements (sites), 
and the local labels are a set of lines connecting these sites pairwise. In these algorithms, 
loops replace clusters. Similar to the SW algorithm, each loop has one-bit degree of freedom, 
i.e., +1 or —1, which we denote by Xi where i specifles the loop. In contrast to the SW 
algorithm, not all the sites on the loop have the same state of plus or minus one, occupied 
or unoccupied, etc.. Therefore, the state of a site in the loop does not necessarily have the 
same value as the cluster variable Xi. Flipping the loop takes its state from one value of 
Xi to another. What the above algorithms share with the SW algorithm is one-bit cluster 
variables x^, the stochastic assignment of the labels, the free flipping of the clusters, and the 
maintenance of detailed balance. 

Even more complicated situations can exist. Almost all cluster or loop algorithms to-date 
deal with local states that have only two possible values: spin up or spin down, occupied 
or unoccupied, etc. Clearly, richer physical models may have more than two values and 
may lead to a very rich parameterization of the state of individual clusters or loops. In this 
paper, we will be implicitly addressing quite general forms of clusterings. We assume that 
for a given state A (3.1), we can assign a label B (3.3), from which we can form Nc clusters. 
Each cluster can be assigned a variable Xi to describes its "state". In general, this cluster 
variable is not necessarily a one-bit variable. For a given cluster, the value of Xi is just one 
from a set of values assigned to the allowed states of the cluster. 

We deflne $(-B) as the set of all allowed conflgurations consistent with the label B, i.e., 
$(5) = {yl|Pr(yl,5) ^ 0}. In any cluster algorithm, for the global label B, an arbitrary 
state in $(-B) is specifled by a set of cluster variables 

X = {xi,X2,...,XNc}, (3.4) 

in such a way that the state of a local element depends only on the cluster variable Xi where 
i specifles the cluster to which the element belongs. In a more formal language, the essence 
of clustering is represented by the existence of some one-to-one mapping /^ that maps the 
set of cluster conflgurations X onto $(-B). Of course, such a mapping depends on the label 
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B. What we will call a cluster Monte Carlo algorithm is a dual Monte Carlo process defined 
by (2.14) and (2.15) where at least one non-trivial mapping /^ exists for almost all B for 
which Pr(5) > 0. Here, a trivial mapping, which exists for any $(-B), is the one in which 
there is only one cluster. 

C. Special cluster algorithms 

In what follows, we will mainly be concerned with the question, "Can we create a cluster 
Monte Carlo algorithm in which the clusters can be flipped independently?" because we 
expect that such a cluster algorithm is advantageous both for the computational simplicity 
and for reducing autocorrelation times. 

As stated before, a deflning property of a cluster algorithm is the limited range of states 
to which the flnal state A is restricted by the label B. We can express this situation by 
writing the conditional probability Pr(yl|5) as 

Vi{A\B) = WXA)A{A, B)/N{B) (3.5) 

where the weight function H^'(yl) is to be specifled, N[B) is the number of members of 
$(5), and A{A,B) is deflned by 

This deflnition for Pr(yl|5) is not unique, but it has to be chosen so that it satisfles such 
relations as 

l = j:Pr{A\B) (3.7) 

A 

Fv{A) = J2 Pr(A|5) Pr(5) (3.8) 

B 

that are intrinsic to conditional probabilities. 

In the language of the last subsection, the conditional probability Pr(yl|5) means that 
after assigning a label B, we should pick a value X with probability 

Pr(X) ^ W'ifiX))/ Y: W'ifiX)). (3.9) 

X 

Then, we map X into $(-B) with the function /^ to obtain the flnal state A'. We comment 
that conditional probabilities cannot in general be written in the form of (3.5). In other 
words, when solutions exist, (3.5) deflnes a special class of cluster algorithms. To our 
knowledge, existing cluster algorithms belong to this class. 

We can obtain an important additional subclass of cluster algorithms by setting H^'(yl) = 
1. This selection implies that for any cluster parameterization /^ of $(-B), Pr(/^(X)|5) 
is just a constant which is independent of X. Therefore, we can generate the flnal state 
A consistent with the label B by picking with equal probability a set X at random, and 
then mapping this set into $(-B) by the function /^. Because of the clustering, the state 
of the system is a direct product of the states of the clusters, we pick an X by picking 



independently and uniformly the Xi from the set of possible states of cluster i and collecting 
these values to form X = {xi,X2, ■ ■ ■ ,xn^}. Roughly speaking, the algorithms in this 
subclass are characterized by clusters that do not interact with each other. 

As already noted, in a dual Monte Carlo algorithm of heat-bath-type, all we need to 
specify is Pr(yl,5). To do this for the class of algorithms characterized by (3.5), we start 
by writing Pr(A) = W{A)/Z where Z = Y.A W{A) and then replacing Pr(A|5) in (2.17) by 
(3.5), we have 

J2^{A,B)V{B) = Wo{A) (3.10) 

B 

where the weight H^(yl) is decomposed as 

W{A) = Wo{A)W'{A) (3.11) 

and 

V{B) = Z Ft{B)/N{B). (3.12) 

The first factor Wq is used in determining the labeling probability; the second factor W, 
the flipping probability of clusters. Once we choose H^'(yl) and determine l/Fo(A) by (3.11), 
equation (3.10) can be viewed as a linear equation with undetermined variables V[B). We 
emphasize that neither the uniqueness of the solution to (3.10) or its existence is guaranteed. 
If we obtain a solution for V[B) which satisfles (3.10), we can calculate Pr(5|yl) by 

Fv{A\B)'Pv{B) 



Ft{B\A) 



Pt{A) 
[W\A)A{A, B)/N{B)][V{B)N{B)/Z] 

W{A)/Z 
A{A,B)V{B) 



(3.13) 



Wo{A) ■ 
In this way, we can determine both Pr(yl|5) and Pr(5|yl) for a given weight H^(yl). 

D. Cluster algorithms with local labeling rules 

We have reduced the task of constructing a cluster algorithm to solving (3.10) for V[B), 
but we still have too many degrees of freedom in the algorithm to flx since in many cases the 
dimensionality of the label space on which Pr(5) is deflned is by far greater than that of the 
conflguration space. Therefore, to reduce further the degrees of freedom in the algorithm 
to make the problem tractable, we will consider a situation where H^(yl), H^'(yl), A[A,B), 
and V[B) can be decomposed into products of local factors. 

W{A) = ]lw{a,), (3.14) 

i 

WXA) = Y[wXa,), (3.15) 

i 

A{A,B) = ]l6{a„b,), (3.16) 

i 

V{B) = llv{b,). (3.17) 
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In other words, we are considering cluster algorithms where the rules for generating a new 
label B' for the whole system are given in terms of a collection of rules for the local elements. 
This situation is the case, for example, in the Swendsen-Wang algorithm for classical Potts 
models. Now, we can arrive at a set of equations with the number of degrees of freedom of 
order 0{1) 

J2K<^,bHb) = wo{a) (3.18) 

b 

where Wo[a) = w[a)/w'[a). This local equation is of central importance to this paper. Once 
we get a solution v[b) of this equation, we can obtain the transition probability Pr(5|yl) as 
follows 

Pi:{B\A) = l[Pi:{k\a,) (3.19) 

i 

where 

Pr(%) = ^KM^. (3.20) 

Wo{a) 

As we have seen, once a label b is chosen for a local unit, the state a of this unit can have 
only the values allowed by the matrix 6[a, b); that is, only the a's for which 6[a, b) ^ are 
possible. In order that this restriction on a leads to clusters in the whole system, the label b 
must represents some clustering of the local unit, i.e., 6[a, b) has to satisfy some condition. 
In other words, b must be such a label that breaks up local elements in the local unit into 
several groups and locks the elements in each group into a single degree of freedom. 

IV. APPLICATIONS 

We now discuss several cluster algorithms from the point of view just developed. All 
these algorithms will be ones with local labeling rules. 

A. Swendsen-Wang algorithm for the Ising model 

The Swendsen-Wang algorithm [1] is an example of a cluster algorithm with a local 
labeling rule which is a free cluster algorithm if the external magnetic field is zero. We will 
derive an algorithm for non-zero magnetic field that will reduce to the zero-field Swendsen- 
Wang algorithm. The Hamiltonian is 

n=-JJ2S^SJ-HJ2S^ (4.1) 

with Ising variables Si = ±1. For this model, we take the local elements to be the lattice 
sites, the local units to be the bonds specified by (i,i), and the local variables Si to be the 
Si. The other local variables a^ have at each i one of four values ( — 1,-1), (1,1), ( — 1,1), 
and (1, —1) which we will identify as 1, 2, 3, and 4. These values are the four allowed spin 
orientations on the bond. 

11 



The decomposition of the Boltzmann weight H^(yl) = exp(— /37Y) is 

W{A) = Wo{A)W'{A) (4.2) 

where 

W'{A) = l[exp{^HS,), (4.3) 

i 

Wo{A) = n exp(/3J5,5,) = n H<^{^,J))■ (4.4) 

Here, the local weight w[a) is 

w{l)=w{2)=r, w{3) = w{4:) = r-\ (4.5) 

where r = exp(/3J)). Thus, there are only two possible break-up operations, binding the 
two sites or not binding them. Accordingly, there are three possible local labels: the label 
6 = 1 has 6[a, b) = 1 when a = 1, and 2, and the label 6 = 2 has 6[a, b) = 1 when a = 3 and 
4. These two labels correspond to binding two sites. The label 6 = 3 has 6[a, 6) = 1 for all 
a and corresponds to non-binding. 

In matrix form, these three decompositions are 



6{a, b) 



(I 1\ 
1 1 
1 1 

Vo 1 1/ 



(4.6) 



Using the fact that the w{l) = w{2) and w{Z) = t«(4), we can depict this matrix as in Fig. 1. 
Equation (3.18) for the labeling probability reduces to 

^;(l) + ^;(3) = r, 
^;(2) + ^;(3) = r-^ 

and solving these equations, we obtain a set of solutions which depend on a free parameter 

v{l) = r-p, v{2)=r-^-p, v{'i) = p, (4.7) 

where < p < r~^ . This last constraint is necessary to insure f(6) is non-negative. 

What is the best choice for p is, in general, a difficult question. However, the simple 
guideline that in a free cluster algorithm we should make the resulting clusters as small 
as -possible helps us chose a proper value. This guideline is understandable if we note that 
because the flipping probability of every cluster is 1/2, the autocorrelation in the sequence of 
Monte Carlo data seems likely to decrease faster with a large number of small clusters than 
with a small number of large clusters. In the present case, this guideline suggests that we 
choose the largest possible p, i.e., p = r, because ^(3) = p is proportional to the probability 
of "cutting" the connection which seems necessary to promote cluster generation. Thus, our 
flnal result for the local labeling weight v is 
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v{l) = l-r, ^;(2) = 0, v{3) = r. (4.8) 

and by using (3.20), we have 

/I -r 1 -r 0\ 
Pr(6|a) = 0. (4.9) 

V r r 11/ 

If a bond {i,j) is called satisfied when JSiSj is positive and unsatisfied otherwise, this local 
conditional probability tells us to cut all unsatisfied bonds and cut satisfied bonds with 
probability r = e~^^^ . This prescription is nothing but the ordinary clustering rule for the 
Swendsen-Wang algorithm. 

Because of (3.9) and our choice of W (4.3), the fiipping probability of each cluster is 
easily computed. The result is 

%p = -^T^- (4-10) 

Here Wc = exp[/3HMc) where Mc is the magnetization of the cluster whose absolute value 
equals the number of local elements in the cluster. 

B. The Swendsen-Wang replica Monte Carlo method for spin glass systems 

The replica Monte Carlo method [2] was proposed by Swendsen and Wang for spin glass 
systems. Instead of simulating a system sequentially at different temperatures, they treat 
simultaneously several independent replicas of the system at different temperatures. They 
do this by considering a pair of systems at a time. The Hamiltonian for the pair is 

^ = - E -M E J^A'^Sf in < r,\ (4.11) 

where S'j- is an Ising variable and \Ji^j\ = J. A site is specified by two indices {i,fJ.). The 
difference r2 — ri scales the temperature difference between the two systems. 

In the replica Monte Carlo algorithm, a local element is a site (i,/i), and a local unit 
[i,j] is a quartet of sites (z, 1), (i,2), (j, 1) and (j, 2). For this algorithm the decomposition 
(3.11) is 

W'{A) = exp{-(3n) (4.12) 

Wo{A) = 1. (4.13) 

In other words, all the weight is in the fiipping probability of clusters. The algorithm is 
not a free-cluster algorithm. The labeling procedure is deterministic, meaning that given A, 
A[A, B) is non-vanishing (i.e., unity) only for one B and that 

Fv{B\A) = A{A,B) (4.14) 

A{A,B) = l[6{ai,bi). (4.15) 
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On the other hand, (3.5) reduces into 

Ft{A\B) = W{A)A{A, B)/N{B) (4.16) 

and tells us that as long as the final state is allowed under the label B, the transition 
probability is determined by the original weight H^(yl). From (2.9), it follows that the 
probability of fiipping a cluster is given by 

P = R/{1 + R), (4.17) 

where R is the ratio of the weights of the state A before the fiip and the state A' after the 
fiip, i.e., R = W'{A')/W'{A) = W{A')/W{A). 

The labeling rule is: If both bonds (z, 1) — (j, 1) and (z, 2) — (j, 2) are satisfied, or both 
are unsatisfied, we "freeze" the local unit. Here, freezing is the label under which the four 
sites are bound to each other. If one bond is satisfied and the other is unsatisfied, a vertical 
break-up is applied, which means the assignment of the label under which (z, 1) is bound to 
(z, 2) and (j, 1) is bound to (j, 2). 

If we call "red" a pair of sites for which S} S} is negative and call "black" any other 
pair, the resulting clusters are groups of red or black sites surrounded by the other color. 
Flipping a cluster does not change the color of the cluster, since for any 2, the two sites (z, 1) 
and (z, 2) are bound to each other. Therefore, this procedure does not constitute an ergodic 
Markov process. It needs to be combined with an ergodic process to make the entire Markov 
process ergodic. 

We also note that any local unit [i,j] for which i and j belong to different clusters 
has one and only one satisfied bond. Thus, if we fiip a cluster which includes the site 
2, this local unit contributes to the ratio R in (4.17) with a factor exp(2/3J(r2 — ri)) or 
exp(— 2/3J(r2 — Ti)), according to the original orientation of spins. In particular, the fiipping 
probability of a cluster in this algorithm becomes 1/2 when ri = r2. This situation illustrates 
why this algorithm is intended for a system with small r2 — ri: When r2 — ri is large, the 
fiipping probability of a large cluster is in most cases very small, and the algorithm becomes 
ineffective. 



C. An ergodic, free-cluster, replica method 

Following our general procedure of constructing a cluster algorithm, we now will construct 
an ergodic, free-cluster version of the replica Monte Carlo method. For this algorithm, we 
choose the weight H^'(yl) to be unity and Wq = exp(— /37Y). 

We consider four types of break-up operations (Fig. 2). The first one [b = 1) is a 
horizontal break-up in which (i,/i) is bound to (j^fJ-). The second one [b = 2) is where 
(2,2) is bound to (j, 2) but (i, 1) and (i,2) are bound to no site. The third [b = 3,4) is 
the previously used vertical break-up. The last one [b = 5) has no site bound to any other 
site. We note that two labels correspond to a vertical break-up; however, these two labels 
are equivalent because of the symmetry. Therefore, we search for a symmetric solution with 
respect to these two labels, that is, a solution for which ^(3) = '"(4). For the same reason, 
these two labels are represented in Fig. 2 by a single column (the third column). Since the 
matrix 6[a, b) is the upper item of each entry in Fig. 2, the weight equation becomes 
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^;(l) + ^;(2) + ^;(5) = l, 

v{2) + v{3) + v{5) = p, 

v{3) + v{5) = q, 

v{5)=pq, (4.18) 

where p = exp(— 2/3ri) and q = exp(— 2/3r2). The solution is 

v{l) = {l-p){l-q), v{2)=p-q, 

v{3)=v{A) = q{l-p), v{5)=pq. (4.19) 

As a result, the labeling probability is given by the lower item of each entry in Fig. 2. 

As we can see in Fig. 2, when ri = r2, the label 6 = 2 is not assigned to any unit. In 
this case, we can argue that the resulting cluster size is generally smaller than that for the 
Swendsen-Wang replica algorithm: First, we note that no horizontal binding occurs in any 
unit where one bond is unsatisfied and the other is satisfied. This type of bonding also 
occurs in the Swendsen-Wang replica algorithm; however, in the present algorithm, even 
the vertical bonds are missing with a finite probability. Reduction in cluster size follows 
partially from this property. Additionally, no binding is applied to any unit where two 
bonds are unsatisfied while complete freezing is applied to such a unit in the Swendsen-Wang 
case. This second fact reduces the average cluster size even further. Still another reduction 
effect exists: For a unit where two bonds are satisfied, the break-up 6 = 5 is applied only 
with a finite probability while this type of horizontal binding occurs with probability 1 in 
the Swendsen-Wang case. This effect can decrease the cluster size. Therefore, the present 
algorithm has at least two important differences from the previous one in the case where 
ri = r2: ergodicity and smaller clusters. 

On the other hand, when ri < r2, smaller clusters are not guaranteed because of the 
existence of the label b = 2 which has no counterpart in the other algorithm. This label 
is necessary for the present algorithm to be a free cluster algorithm. If we abandon this 
free-cluster property and take the difference r2 — ri into account in the flipping probability 
and not in the labeling probability, then we obtain another algorithm which is similar to the 
Swendsen-Wang algorithm, but is one which is ergodic and produces smaller clusters. We 
believe, however, the free cluster version of the algorithm presented here is at least equivalent 
to this alternative algorithm in terms of efficiency. 

D. A Free cluster algorithm for the S = 1/2 XXZ quantum spin model 

The loop algorithm [11,14] recently proposed for the massless 6-vertex model can be 
applied to the quantum S = 1/2 problem, since in the worldline quantum Monte Carlo for- 
mulation of the problem, configurations with the quantum Boltzmann weight are equivalent 
to those of a special case of 6-vertex model [11]. Because the details of the application of 
cluster methods to quantum spin problems have not been explicitly presented elsewhere in 
detail, we will now give them using the language established in the present paper. 

The Hamiltonian is 

n = - J E[^«^J + ^^D + ^r^|]> (4-20) 
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where erf (/x = x,y,z) is a Pauli operator and the constant A describes the anisotropy of 
the problem. By using the Suzuki-Trotter decomposition formula, we can map the original 
problem into a problem with classical degrees of freedom in the next higher dimension. In 
what follows, we focus on this restatement of the problem. We will refer to the axis along 
the additional dimension as "vertical", and to the other axes as "horizontal;". 

The local element is a site and a local unit is a shaded vertical plaquette. A site is 
specified by a set of indices (i,t) where i specifies the horizontal location and t specifies the 
vertical location. The local variable defined on each site is a one-bit variable with the values 
of or 1, corresponding to a down and an up z-component of spin. A vertical plaquette is 
one formed by four neighboring sites with two vertical edges and the other two edges being 
horizontal. Some fraction of vertical plaquettes are shaded. Which plaquettes are shaded 
depends on both the original lattice and the decomposition of the original weight by the 
Suzuki-Trotter formula. The shaded plaquettes are distributed across the whole lattice in 
such a way that no two local units share an edge (but they do share corners). Since a local 
unit consists of four sites, the variable defined on a unit can, in general, have 16 values. 
For the present problem, however, a local weight is vanishing for some of these values. We 
will represent a local state a^ of a unit p in terms of the four sites {i, t), (j, t), {i, t + 1) and 
(j, t + 1) that belong to p as follows 

ap = {'>T'{i,t),'n(^j,t),'n(^i,t+i),'n(^j,t+i)} (4-21) 

The weight for a local unit (i.e., a shaded plaquette) is non-vanishing only when ?7.(i^t)+n(j^t) = 
'^(i,f+i) + 'T-y.f+i)- As a result, a local unit can have only 6 out of 16 possible states. We 
denote these 6 states by 1,1, 2, 2, 3, and 3. In the site representation stated above, these 
states are 



1 = {0,0,0,0}, 


I = {1, 1,1,1}, 


2 = {1, 0,1,0}, 


2 = {0,1,0,1}, 


3 = {1, 0,0,1}, 


3 = {0,1,1,0}. 



(4.22) 

Besides the constraint mentioned above, another restriction to the space of states with 
non- vanishing weight exists, namely, ?T.(i,f) = '".(i^f+i) when the vertical edge (i,t) — [i,t-\- 1) 
does not belong to any shaded plaquette. The space $ is defined as the set of all states 
{'T'(i,f)} that satisfy these two constraints. 

The Boltzmann factor becomes 

W{A) = l[w{a,), (4.23) 



w[l) = w[l) = exp(ibT), 

10(2) = w{2) = exp(=FT) cosh(2AT), 

10(3) = io(3) = exp(=FT) sinh(2AT), (4.24) 

Here, r is /3\J\/m for a Trotter number of m. In the case of ferromagnetic models (J > 0), 
we take the upper sign. We take the lower sign for antiferromagnetic models (J < 0), if the 
model is on non-frustrated lattices such as a square lattice. 
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First, we consider the ferromagnetic case. The four labels are depicted in Fig. 3(a), and 
the resulting weight equation is 

^;(0) + ^;(2) + ^;(3) = exp(T), 

v{l) + v{3) = exp(-T) cosh(2AT), 

^;(1) + v{2) = exp(-T) sinh(2AT). (4.25) 

The solution is 



(0) = 


= e^ - p. 


(1) = 


= (_p + e(2A-i).)/2, 


(2) = 


= {p - e-(^^+^)^)/2. 


(3) = 


= (p + e-(^^+^)-)/2. 



(4.26) 

where p is an adjustable parameter. Here, the guideline we use to determine p is the same 
as we used to determine p in the the Swendsen-Wang algorithm for the Ising ferromagnet: 
we make the clusters as small as possible. In the present case, this means making i'(O) as 
small as possible. The range of p is given by 

0<p< min{e^, e^^^"^)^}. (4.27) 

Therefore, in the case of XY-type anisotropy, i.e., A > 1, we take p = e^ . The resulting 
labeling probability is given in Table I. From this Table, we note that the probability of 
assigning a break-up of type is zero. In other words, the resulting clusters are simple closed 
loops with no branches. 

On the other hand, in the case of Ising-like anisotropy, i.e., A < 1, we take p = e^^^~^''^. 
Hence, the labeling probability becomes the one shown in Table II. In this case, the branch- 
ing of loops is inevitable because p(l|0) = p(l|0) = 1 7^ 0. 

Next, we consider the antiferromagnetic case. In this case, we assign the break-up of type 
to the states 2 and 2 with finite probabilities (Fig. 3(b)), in contrast to the ferromagnetic 
case where we assigned it to the states 1 and I. The weight equation is 

^;(2) + ^;(3) = exp(-T), 
■u(O) + v{l) + v{3) = exp(T) cosh(2AT), 

^;(1) + v{2) = exp(T) sinh(2AT). (4.28) 

Its solution is 

v{0) = e-^-p, 

v{2) = {p- e-(^^+^)-)/2, 

^(3) = (p + e-(2^+i)-)/2, (4.29) 

where p is an adjustable parameter whose range is given by 

<p< minle"" cosh(2AT), e""" + e"" sinh(2AT)}. (4.30) 
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The maximal cluster number guideline suggests that we take 

p = e^cosh(2AT) (4.31) 

in the case of XK-like anisotropy and 

p = e-^ + e^sinh(2AT) (4.32) 

in the case of Ising-like anisotropy. As a result, the labeling probabilities become the ones 
shown in Table III and Table IV. 

We note that in both the ferromagnetic and the antiferromagnetic cases, the binding of 
two loops cannot be avoided for Ising-like anisotropy (A < 1). In other words, the clusters 
formed in the case of Ising-like models are not simple closed loops. Roughly speaking, the 
clusters in this case are "bulkier" than those in XK-like models and spread out more in 
the horizontal (real-space) direction. This situation is natural, when we note that in the 
extremal anisotropy case, i.e., in the case of purely classical Ising models, a cluster in the 
Swendsen-Wang algorithm occupies a wide region of the space. We can show that this naive 
relation between the present algorithm and the Swendsen-Wang algorithm can be stated 
more clearly as follows. 

When we take the classical limit, i.e., the limit of A ^ 0, the local weight factors in the 
ferromagnetic case are 

w[l) = w(l) ^ exp(T), 

w{2) = w(2) ^ exp(— r), 

w{3) = w{3)^0. (4.33) 

If we draw so-called worldlines by connecting the sites on which the local variables have the 
same value, these worldlines become straight lines because the probability of "bending" a 
worldline is proportional to w[3) and is vanishing. A straight worldline of I's corresponds 
an up-spin in the classical Ising model and a worldline of O's corresponds a down-spin. The 
non- vanishing labeling probabilities are 

p{0\l) = p{0\l) = 1 - e-'\ (4.34) 

p{3\l) = p{3\l) = e-'\ (4.35) 

K3|2)=K3|2) = 1. (4.36) 

Now we will consider the situation where r is small, and hence the Suzuki-Trotter ap- 
proximation is a good one and we are in the quantum limit. In such a case, p(0|l) ~ 2t and 
is much smaller than p(3|l) ~ 1 — 2t. Therefore, the labeling probabilities listed above indi- 
cate that we should assign the third label to almost all the plaquettes. In other words, any 
one of the resulting loops is almost identical to one of worldlines, except for some that are 
bound to each other with a small probability. Two kinds of loops are formed: ones for which 
the underlying worldlines are up-spins and the ones for which the underlying worldlines are 
down-spins. Two adjacent loops (i.e., straight lines) of the same kind are bound to each 
other at the plaquette between them with a probability 1 — e~^'^. As a result, the probability 
for two neighboring loops of the same kind not to be bound to each other becomes 
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-2T\m _ -2/3J 



(4.37) 



since there are m shaded plaquettes between two adjacent straight hnes. 

To summarize, in the classical limit, we assign a vertical line to each site in the original 
representation and bind two adjacent lines of the same kind with the probability 1 — e~'^^^ . 
This construction is exactly the same as the ordinary Swendsen-Wang algorithm for classical 
Ising models if we interpret worldlines of O's as down-spins and worldline of I's as up- 
spins. By a similar argument, we can show that the A ^ limit of our algorithm for the 
antiferromagnetic case reduces to the Swendsen-Wang algorithm for the antiferromagnetic 
Ising model. Thus, the present cluster algorithm is the extension of the Swendsen-Wang 
algorithm to quantum spin problems with S = 1/2. Although the critical slowing down in 
the quantum simulation of a 5" = 1/2 system has not extensively studied so far, obviously 
such a difficulty will exist. The present algorithm will be essential for reducing the difficulty 
in such simulations. 



V. CONCLUDING REMARKS 

We have discussed the development of cluster algorithms from the viewpoint of probabil- 
ity theory and not from the usual viewpoint of a particular model. One of our motivations 
was to define general procedures for constructing clusters that are independent of the effec- 
tive Hamiltonian and interaction concepts used by Kandel and Domany [4]. By using the 
perspective of probability theory, we clearly detailed the nature of a cluster algorithm, made 
explicit the assumptions embodied in all clusters of which we are aware, and defined the 
steps for the construction of a free cluster algorithms. We illustrated these procedures by 
rederiving the Swendsen-Wang algorithm, presenting the details of the loop algorithm for a 
worldline simulation of a quantum S = 1/2 model, and proposing a free cluster version of 
the replica method for the Ising glass. 

Within the perspective of probability theory, we emphasized defining the labeling scheme 
embodied by the function 6[a, b) and the fiipping weights v[b) of the clusters, as they are 
actually the only things that one needs, instead of specifying the labeling probabilities, 
which is often done. By this shift in emphasis, we showed that the development of a cluster 
algorithm reduces to the solution of a linear system of equations that is generally underde- 
termined. A solution to this linear system is not guaranteed, but we were always able to find 
at least one solution by adding additional labels, if necessary. When multiple free cluster 
solutions exists, if we choose the one that should produce the largest number of clusters, we 
then typically recover existing algorithms, like the Swendsen-Wang algorithm and can also 
develop new free cluster algorithms, like the one for the Ising glass (Section IV B). With this 
approach, the development of a cluster algorithm is reduced to picking a solution of these 
equations (after one has specified the labeling). 

We have presented the details of non-trivial examples of how one might obtain free cluster 
algorithms for several different systems, including a quantum mechanical problem and a spin- 
glass model. For these systems, the potential algorithms had a few free parameters, which 
were easily determined by the rule that as many clusters as possible should be formed. In 
other cases, they may be determined uniquely by eliminating a certain set of possible labels 
from consideration. For cases tested to date, we are receiving superior performance. In the 
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Appendix, we propose a more "black-box" approach: maximizing the information theory 
entropy under the constraints of normahzation and the hnear system. For the Ising model 
and the Swendsen-Wang labeling, we find an algorithm similar to Swendsen- Wang's method. 

A crucial condition for cluster algorithms developed so far appear to be the decompos- 
ability of H^(yl) into a product Hi w[ai). As we argued, this condition is not very restrictive. 
All classical systems we can think of satisfy it, but some methods for simulating quantum 
states do not. What our formalism, or any other one, leaves unspecified is the labeling. 

Specifying the labeling is the heart of the problem. Choosing labels that create clusters 
which approximate the large-scale coherent structures consistent with the actual physical 
behavior seems a natural thing to do. In general, this identification might be difficult because 
it requires understanding some aspects of the answer before attempting the solution. For 
some algorithms, like the methods for the 6-vertex model [11] and worldline method for 
fermions [12] and quantum spins [14], creating loops has proven to be effective and natural. 
The motivation for these latter algorithms was in part to maintain a local conservation 
condition. More recently, in a worldline method of general quantum Heisenberg spins [13], 
the importance of a conservation condition again appears. Loops, instead of clusters, appear. 
This appearance underscores once again the need for a generalized approach to cluster 
algorithms we presented: the presented approach is free from a specific model and concepts 
that might be unnatural for the problem at hand. 

For several systems, creating loops has been proven to be very effective. These systems 
include the work of Kandel, Ben-Av, and Domany [3] on the fully frustrated Ising model, 
the work of Evertz, Luna, and Marcu [11] on the 6-vertex model, and our recent work on 
fermion and quantum spin models. We recall our discussion of the S =1/2 XXZ model. In 
this model, for the procedures followed, whether we obtained a cluster or loop algorithm 
depended on the anisotropy. This result is surprising and illustrates that the goal of a cluster 
algorithm perhaps should not be constructing clusters or loops but rather creating whatever 
large-scale coherent structures that are convenient and effective. 

Besides the work of Kandel and Domany, we would like to acknowledge several other 
works that make points related to ours. In the context of bond percolation, Ising cluster 
dynamics, Tamayo and Brower [15] have remarked that the cluster process can be viewed 
as a Monte Carlo process involving a joint probability function for the Ising variables and 
labels. They also pointed out that the Swendsen-Wang algorithm was not a unique way 
to produce a cluster algorithm for the Ising model and proceeded to develop other free 
cluster algorithms, one of which they demonstrated had higher efficiency. Their procedures, 
derived form the Kandel-Domany perspective, also lead to an underdetermined system of 
equations. Choosing the flipping weights, and not the labeling probabilities, is implicit in 
the work of Tamayo and Brower and also in the work by Evertz, Luna and Marcu on the 
6-vertex model. It is explicit in the very recent work by Coddington and Han on the fully 
frustrated Ising model. In all these cases, the authors are lead to an underdetermined linear 
system of equations for which more than one acceptable solution exists. In developing the 
algorithms for the 6-vertex model, Evertz, Luna, and Marcu [11] used what they called 
"the principle of minimal freezing" to avoid the algorithm from producing unfavorable large 
clusters. Coddington and Han [16] also searched for solutions that avoid the production of 
large clusters. Thus, these workers, as ourselves, have demonstrated how to produce free 
cluster algorithms and possibly to avoid ones with unfavorable large clusters. We all have 
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avoided the pitfalls of naive generalizations of the Swendsen-Wang algorithm commented 
upon by Domany and Kandel [4]. The unsatisfied challenge is how to find the optimal 
algorithm. 
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APPENDIX 

Another approach for solving (3.18), and making it more of a "black-box" procedure, 
is to exploit the a priori knowledge that Pr(6) = v[b)/ J^b'^i^) i^ ^ probability and use the 
principle of Maximum Entropy to assign these probabilities [5]. With this approach, the 
problem reduces to maximizing 

g = - E Pr(&) In Pr(6) + AE Pr(6) - 1] + E Va[Pr{a) - ^ d{a, h) Pr(6)] (Al) 

b b a. b 

with respect to the Pr(6). In this equation, the A and rja are Lagrange multipliers and 

d{a,b) = 6{a,b)/n{b) (A2) 

where n[b) = X^b^(a, &). The first term in (Al) is the information-theory entropy term. In 
the absence of the remaining terms, maximization would result in the probabilities Pr(6) 
appearing with equal weight. The second term constrains the solution to be normalized, 
and the third term constrains the solution to satisfy the linear equation for (3.18). 
The maximization yields 

Pr(6) = e" E.''-^''.*')/ J2 e" E. "-<''.*') (A3) 

b 

and the rja satisfy the following set of non-linear equations: 

Pr(a) = E^(a,6)e-E."'.'<-'.*')/Ee-E.''<^<-.*') (A4) 

b b 

In general, these non-linear equations require numerical solution. The zero-field Ising 
model is simple enough that analytic solutions are possible. Using the results and definitions 
of Section IV A, we find for Pr(6) that 

Pr(l) = 1/(1 +r + r2) (A5) 

Pr(2) =rV(l +r + r2) (A6) 

Pr(3) =r/(l+r + r2) (A7) 

where as before r = e~'^^^ . For the Swendsen-Wang algorithm, one has from (4.8) 
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Pr(l) = l-r (A8) 

Pr(2) = (A9) 

Pr(3) = r (AlO) 

The two solutions thus approach one another in the low temperature limit. We have not 
made numerical tests of differences in computational efficiency. 
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TABLES 



TABLE I. The labeling probabilities for XY-like ferromagnets. 



b=0 



b=l 



b=2 



b=3 



-2(^+1; 



-2(^+1. 



a=l,l 
a=2,2 
a=3,3 



ill 



n 



(e2(^-l)T-i)e2T 

2 cosh(2Ar) 
(e2(^-l)T-i)e2T 

2 sinh(2Ar) 



2 



(_e2(^+l)T+i)e2T 
2 sinh(2Ar) 



(l±e 



n 



(e-2(^+l^+l)e2-^ 
2 cosh(2Ar) 





TABLE IL The labeling probabilities for Ising-like ferromagnets. 



b=0 



b=l 



b=2 



e-2(i-Ah 






b=3 



a=l,l 
a=2,2 
a=3,3 



6-2-^ sinh(2AT) 



6-2-^ cosh(2AT) 



TABLE in. The labeling probabilities for Xy-like antiferromagnets. 





b=0 


b=l 


b = 2 


b=3 


a=l,l 
a=2,2 
a=3,3 









2e2T cosh(2Ar) 

e2(l + ^)T_i 
2e2T smh(2Ar) 


(I_e2(l-A)r) 

2 



l_e2(l-^)-r 

2e2T smh(2Ar) 


(l+e2(l-A)r) 

l+e2(l-A)T 
2e2T cosh(2Ar) 





TABLE IV. The labeling probabilities for Ising-like antiferromagnets. 



b=0 



b=l 



b=2 



b=3 



a=l,l 
a=2,2 
a=3,3 





e2(l-A)T_i 

2-r cosh(2Ar) 







tanh(2AT) 
1 



cosh(2Ar) 
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FIGURES 

FIG. 1. The labels and the labeling probabilities for the Swendsen-Wang algorithm. Dashed 
lines in the leftmost column represent unsatisfied bonds; the solid lines, satisfied bonds. The upper 
item in each entry is the matrix element ^(a|6);the lower item, the labeling probability. 

FIG. 2. The matrix elements 6{a\b) and the labeling probabilities for the ergodic free-cluster, 
replica, Monte Carlo method for spin glass systems. The upper item in each entry is the matrix 
element 6{a\b); the lower item, the labeling probability. The horizontal lines (dashed or solid) in 
each diagram in the leftmost column represent bonds at the same location but in different replicas. 

FIG. 3. The labels and matrix elements 6{a\b) for the loop algorithm for (a) ferromagnetic and 
(b) antiferromagnetic quantum spin systems with S = 1/2. 
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